
    TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers

    Leveraging the characteristics of convolutional layers, neural networks are extremely effective for pattern recognition tasks. However, in some cases, their decisions are based on unintended information, leading to high performance on standard benchmarks but also to a lack of generalization to challenging testing conditions and unintuitive failures. Recent work has termed this "shortcut learning" and addressed its presence in multiple domains. In text recognition, we reveal another such shortcut, whereby recognizers overly depend on local image statistics. Motivated by this, we suggest an approach to regulate the reliance on local statistics that improves text recognition performance. Our method, termed TextAdaIN, creates local distortions in the feature map which prevent the network from overfitting to local statistics. It does so by viewing each feature map as a sequence of elements and deliberately mismatching fine-grained feature statistics between elements in a mini-batch. Despite TextAdaIN's simplicity, extensive experiments show its effectiveness compared to other, more complicated methods. TextAdaIN achieves state-of-the-art results on standard handwritten text recognition benchmarks. It generalizes to multiple architectures and to the domain of scene text recognition. Furthermore, we demonstrate that integrating TextAdaIN improves robustness towards more challenging testing conditions. The official PyTorch implementation can be found at https://github.com/amazon-research/textadain-robust-recognition.
    Comment: 12 pages, 8 figures. Accepted to ECCV 2022.
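    To make the mechanism concrete, here is a minimal PyTorch-style sketch of the kind of statistic mismatching described above: the feature map is viewed as a few windows along the width (sequence) axis, and each window's per-channel mean and standard deviation are replaced by those of the corresponding window from another sample in the mini-batch. The window count, permutation scheme, and insertion point in the backbone are illustrative assumptions; the official repository linked above is the authoritative implementation.

```python
import torch

def text_adain_sketch(feats: torch.Tensor, k: int = 5, eps: float = 1e-5) -> torch.Tensor:
    """Illustrative sketch: swap local per-window, per-channel statistics across a mini-batch.

    feats: (B, C, H, W) feature map from a text recognizer backbone.
    k:     number of windows the width (sequence) axis is split into.
    Intended as a training-time regularizer only; skip it at inference.
    """
    b, c, h, w = feats.shape
    w_used = (w // k) * k                                   # drop any remainder columns
    windows = feats[..., :w_used].reshape(b, c, h, k, w_used // k)

    # Per-window statistics over each window's spatial extent.
    mu = windows.mean(dim=(2, 4), keepdim=True)
    sigma = windows.std(dim=(2, 4), keepdim=True) + eps

    # Normalize each window, then re-scale it with the statistics of the
    # corresponding window from a randomly permuted sample in the batch.
    perm = torch.randperm(b, device=feats.device)
    mixed = (windows - mu) / sigma * sigma[perm] + mu[perm]

    out = feats.clone()
    out[..., :w_used] = mixed.reshape(b, c, h, w_used)
    return out
```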

    Sequence-to-Sequence Contrastive Learning for Text Recognition

    We propose a framework for sequence-to-sequence contrastive learning (SeqCLR) of visual representations, which we apply to text recognition. To account for the sequence-to-sequence structure, each feature map is divided into different instances over which the contrastive loss is computed. This operation enables us to contrast at a sub-word level, where from each image we extract several positive pairs and multiple negative examples. To yield effective visual representations for text recognition, we further suggest novel augmentation heuristics, different encoder architectures and custom projection heads. Experiments on handwritten text and on scene text show that when a text decoder is trained on the learned representations, our method outperforms non-sequential contrastive methods. In addition, when the amount of supervision is reduced, SeqCLR significantly improves performance compared with supervised training, and when fine-tuned with 100% of the labels, our method achieves state-of-the-art results on standard handwritten text recognition benchmarks.
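    The sketch below illustrates the sub-word contrast under simple assumptions: frame-level features from two augmented views of the same word images are pooled into a fixed number of instances per image, and an InfoNCE-style loss treats aligned instances as positives and every other instance in the batch as a negative. The pooling choice, instance count, temperature, and single-direction loss are placeholders, not the paper's exact projection heads or augmentations.

```python
import torch
import torch.nn.functional as F

def sequence_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                              n_instances: int = 5, temperature: float = 0.1) -> torch.Tensor:
    """Sub-word contrastive loss sketch.

    z1, z2: (B, T, D) frame-level features of two augmented views of the same
            word images (same encoder, different augmentations).
    """
    b, _, d = z1.shape

    def to_instances(z: torch.Tensor) -> torch.Tensor:
        # Average-pool the T frames into n_instances per image, then flatten so
        # every instance is treated as its own contrastive example.
        inst = F.adaptive_avg_pool1d(z.transpose(1, 2), n_instances)  # (B, D, n)
        inst = inst.transpose(1, 2).reshape(b * n_instances, d)
        return F.normalize(inst, dim=-1)

    anchors, positives = to_instances(z1), to_instances(z2)
    logits = anchors @ positives.t() / temperature                 # (B*n, B*n) similarities
    targets = torch.arange(logits.size(0), device=logits.device)   # positives on the diagonal
    return F.cross_entropy(logits, targets)
```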

    CLIPTER: Looking at the Bigger Picture in Scene Text Recognition

    Reading text in real-world scenarios often requires understanding the context surrounding it, especially when dealing with poor-quality text. However, current scene text recognizers are unaware of the bigger picture, as they operate on cropped text images. In this study, we harness the representative capabilities of modern vision-language models, such as CLIP, to provide scene-level information to the crop-based recognizer. We achieve this by fusing a rich representation of the entire image, obtained from the vision-language model, with the recognizer's word-level features via a gated cross-attention mechanism. This component gradually shifts to the context-enhanced representation, allowing for stable fine-tuning of a pretrained recognizer. We demonstrate the effectiveness of our model-agnostic framework, CLIPTER (CLIP TExt Recognition), on leading text recognition architectures and achieve state-of-the-art results across multiple benchmarks. Furthermore, our analysis highlights improved robustness to out-of-vocabulary words and enhanced generalization in low-data regimes.
    Comment: Accepted for publication at ICCV 2023.
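    A minimal sketch of this kind of fusion, under assumed dimensions and module names: the recognizer's word-level features attend over a projected scene-level embedding, and a zero-initialized gate scales the attended branch so that a pretrained recognizer starts out unchanged and only gradually incorporates context during fine-tuning. This illustrates gated cross-attention in general, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GatedCrossAttentionFusion(nn.Module):
    """Sketch: fuse crop-level recognizer features with scene-level context.

    Hypothetical sizes: d_model for the recognizer, d_ctx for the
    vision-language embedding of the full image (e.g. CLIP tokens).
    """
    def __init__(self, d_model: int, d_ctx: int, n_heads: int = 8):
        super().__init__()
        self.ctx_proj = nn.Linear(d_ctx, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Zero-initialized gate: the fused branch starts as a no-op, so the
        # pretrained recognizer is undisturbed at the start of fine-tuning.
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, crop_feats: torch.Tensor, scene_feats: torch.Tensor) -> torch.Tensor:
        # crop_feats:  (B, T, d_model) word-level features from the recognizer.
        # scene_feats: (B, N, d_ctx)   tokens of the whole scene image.
        ctx = self.ctx_proj(scene_feats)
        attended, _ = self.attn(query=crop_feats, key=ctx, value=ctx)
        return crop_feats + torch.tanh(self.gate) * attended


# Hypothetical usage: fusion = GatedCrossAttentionFusion(d_model=512, d_ctx=768)
#                     fused  = fusion(crop_feats, scene_feats)
```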

    European Neuromuscular Centre consensus statement on anaesthesia in patients with neuromuscular disorders

    BACKGROUND Patients with neuromuscular conditions are at increased risk of suffering peri-operative complications related to anaesthesia. There is currently little specific anaesthetic guidance concerning these patients. Here we present the European Neuromuscular Centre (ENMC) consensus statement on anaesthesia in patients with neuromuscular disorders, as formulated during the 259th ENMC workshop on Anaesthesia in neuromuscular disorders. METHODS International experts in the field of (paediatric) anaesthesia, neurology and genetics were invited to participate in the ENMC workshop. A literature search was conducted in PubMed and EMBASE, and its main findings were disseminated to the participants and presented during the workshop. Depending on their specific expertise, participants presented the existing evidence and their expert opinion concerning anaesthetic management in six specific groups of myopathies and neuromuscular junction disorders. The consensus statement was prepared according to the Appraisal of Guidelines for REsearch & Evaluation (AGREE II) reporting checklist. The level of evidence was adapted according to the Scottish Intercollegiate Guidelines Network (SIGN) grading system. The final consensus statement was subjected to a modified Delphi process. RESULTS A set of general recommendations for the anaesthetic management of patients with neuromuscular disorders was formulated. Specific recommendations were formulated for 1) neuromuscular junction disorders; 2) muscle channelopathies (non-dystrophic myotonia and periodic paralysis); 3) myotonic dystrophy (types 1 and 2); 4) muscular dystrophies; 5) congenital myopathies and congenital dystrophies; and 6) mitochondrial and metabolic myopathies. CONCLUSION This ENMC consensus statement summarizes the most important considerations for planning and performing anaesthesia in patients with neuromuscular disorders.
